A Hybrid Supervised Machine Learning Classifier System for Breast Cancer Prognosis Using Feature Selection and Data Imbalance Handling Approaches

Abstract

Breast cancer is now the most frequent cancer among women. Early detection is a critical issue that machine learning (ML) techniques can address effectively. This article therefore investigates methods to improve the accuracy of ML classification models for breast cancer prognosis. A wrapper-based feature selection approach, along with nature-inspired algorithms such as Particle Swarm Optimization, Genetic Search, and Greedy Stepwise, has been used to identify the important features. On these selected features, the popular classifiers Support Vector Machine, J48 (the C4.5 decision tree algorithm), and Multilayer Perceptron (a feed-forward ANN) were used in the system. The methodology of the proposed system is structured into five stages: (1) data pre-processing; (2) data imbalance handling; (3) feature selection; (4) machine learning classifiers; (5) evaluation of classifier performance. The dataset used in the experiments is the Breast Cancer Wisconsin (Diagnostic) Data Set from the UCI Repository. The article indicates that the decision tree classifier is the appropriate machine learning-based classifier for optimum prognosis. With the Particle Swarm Optimization algorithm, the system achieves an accuracy of 98.24%, MCC = 0.961, sensitivity = 99.11%, specificity = 96.54%, and a Kappa statistic of 0.9606. With Genetic Search, it achieves 98.83%, 0.974, 98.95%, 98.58%, and 0.9735 on the same measures. Furthermore, the Multilayer Perceptron ANN achieves 98.59%, 0.968, 98.6%, 98.57%, and 0.9682.
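The evaluation measures named in the abstract (accuracy, sensitivity, specificity, MCC, and the Kappa statistic) can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the counts in the usage example are hypothetical and are not taken from the article.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute common binary-classification measures from confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate (recall on the positive class)
    specificity = tn / (tn + fp)   # true negative rate

    # Matthews correlation coefficient; defined as 0 when a marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_pos = ((tp + fn) / n) * ((tp + fp) / n)
    p_neg = ((tn + fp) / n) * ((tn + fn) / n)
    p_e = p_pos + p_neg
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = binary_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting MCC and Kappa alongside accuracy is useful on imbalanced data, because both penalize a classifier that scores well only by favoring the majority class.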

Related resources

Handling Class Imbalance Problem Using Feature Selection

The class imbalance problem is a challenge to machine learning and data mining, and it has attracted significant research in recent years. A classifier affected by the class imbalance problem for a specific data set would show strong accuracy overall but very poor performance on the minority class. Imbalanced data sets are pervasive in real-world applications. Examples of these ki...
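The effect described above, high overall accuracy with poor minority-class performance, can be illustrated with a toy example. The sketch below assumes a hypothetical 95/5 label split and a trivial classifier that always predicts the majority class. It then applies random oversampling, which is only one simple rebalancing option and is not claimed to be the technique used in any paper listed here. All names are illustrative.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of samples of the given class that were predicted correctly."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)


def random_oversample(labels, seed=0):
    """Return sample indices in which every class is as frequent as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    indices = list(range(len(labels)))
    for label, count in counts.items():
        pool = [i for i, l in enumerate(labels) if l == label]
        # Duplicate randomly chosen samples of the under-represented class.
        indices.extend(rng.choice(pool) for _ in range(target - count))
    return indices


# Hypothetical data: 95 majority-class (0) and 5 minority-class (1) samples.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a classifier that always predicts the majority class

acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, label=1)
print(f"accuracy={acc:.2f}, minority recall={minority_recall:.2f}")

balanced = [y_true[i] for i in random_oversample(y_true)]
print(Counter(balanced))
```

In a real pipeline, oversampling would be applied to the training split only, so that duplicated samples do not leak into the evaluation data.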

A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis

Medical diagnosis problems are the most effective component of treatment policies. Recently, significant advances have been made in medical diagnosis using data mining techniques. Data mining, or Knowledge Discovery, is the process of searching large databases to discover patterns and evaluate the probability of future occurrences. In this paper, Bayesian Classifier is used as a Non-linear dat...

A New Hybrid Feature Subset Selection Algorithm for the Analysis of Ovarian Cancer Data Using Laser Mass Spectrum

Introduction: A major problem in the treatment of cancer is the lack of an appropriate method for the early diagnosis of the disease. The chemical reaction within an organ may be reflected in the form of proteomic patterns in the serum, sputum, or urine. Laser mass spectrometry is a valuable tool for extracting the proteomic patterns from biological samples. A major challenge in extracting such ...

Supervised Feature Selection Based Extreme Learning Machine (sfs-elm) Classifier for Cyber Bullying Detection in Twitter

Detecting cyber bullying, which is prevalent in social networks like Twitter, is a focused research area. Text mining for cyber bullying detection poses several research challenges and offers considerable research scope. This research work uses supervised feature selection by a ranking method to choose features from the tweets. After that extreme learning mach...

Journal

Journal title: Electronics

Year: 2021

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics10060699